Discrimination between Singing and Speaking Voices Using Local and Global Characteristics

Authors

Yasunori OHISHI

Masataka GOTO

Katunobu ITOU

Kazuya TAKEDA

Abstract

This paper discusses discriminating between singing and speaking voices by using the local and global characteristics of voice signals. Subjective experiments show that human listeners can discriminate singing and speaking voices with more than 70.0% and 99.7% accuracy from 200-ms and one-second signals, respectively. Based on these results, and assuming that different features are effective for short-term and long-term signals, we designed two measures: one using the spectral envelope (MFCC) and one using the fundamental frequency (F0, perceived as pitch) contour. Experimental results show that the F0 measure performs better than the spectral envelope measure when the input voice signals are longer than one second. In particular, it discriminates singing and speaking voices with 85.0% accuracy for two-second signals. When the input signals are shorter than one second, on the other hand, the spectral envelope measure performs better than the F0 measure. Finally, simply combining the two measures yields 87.5% accuracy for two-second signals.
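The abstract does not say how the two measures are combined. As a minimal sketch only, the following assumes each measure yields a per-segment score (for example a log-likelihood ratio) where positive values favor singing, and combines them with a weighted sum. The names `MeasureScores`, `combine`, and `classify`, the weight, and the threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class MeasureScores:
    """Hypothetical per-segment scores; positive favors singing, negative favors speaking."""
    spectral: float  # assumed score from a spectral-envelope (MFCC) measure
    f0: float        # assumed score from an F0-contour measure


def combine(scores: MeasureScores, weight: float = 0.5) -> float:
    """Weighted sum of the two scores; `weight` is applied to the spectral score."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * scores.spectral + (1.0 - weight) * scores.f0


def classify(scores: MeasureScores, weight: float = 0.5, threshold: float = 0.0) -> str:
    """Label a segment by comparing the combined score with an assumed threshold."""
    return "singing" if combine(scores, weight) > threshold else "speaking"
```

Under these assumptions, a segment whose spectral score favors singing more strongly than its F0 score favors speaking would be labeled as singing; in practice the weight and threshold would have to be tuned on held-out data.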

Related resources

Discrimination between Singing

On Human Capability and Acoustic Cues for Discriminating Singing and Speaking Voices

In this paper, acoustic cues and human capability for discriminating singing and speaking voices are discussed to develop an automatic discrimination system for singing and speaking voices. Based on the results of preliminary subjective experiments, listeners discriminate between singing and speaking voices with 70.0% accuracy for 200-ms signals and 99.7% for one-second signals. Since even shor...

Discrimination between singing and speaking voices

Development of the F0 Control Model for Singing-Voices Synthesis

Fundamental frequency (F0) control models for singing voices are required to construct singing-voice synthesis systems that can generate natural singing-voices. This paper describes the development of an F0 control model for singing-voices synthesis. F0 fluctuations are revealed as characteristics that need to control the F0 contour of singing-voices by investigating how much they influence sin...

SpeakBySinging: Converting Singing Voices to Speaking Voices While Retaining Voice Timbre

This paper describes a singing-to-speaking synthesis system called “SpeakBySinging” that can synthesize a speaking voice from an input singing voice and the song lyrics. The system controls three acoustic features that determine the difference between speaking and singing voices: the fundamental frequency (F0), phoneme duration, and power (volume). By changing these features of a singing voice,...

Publication date: 2011
